Abstract

This paper quantifies the fundamental limits of variable-length transmission of a general (possibly 
analog) source over a memoryless channel with noiseless feedback, under a distortion constraint. We 
consider excess distortion, average distortion and guaranteed distortion (d-semifaithful codes). In contrast 
to the asymptotic fundamental limit, a general conclusion is that allowing variable-length codes and 
feedback leads to a sizable improvement in the fundamental delay-distortion tradeoff. In addition, we 
investigate the minimum energy required to reproduce k source samples with a given fidelity after 
transmission over a memoryless Gaussian channel, and we show that the required minimum energy is 
reduced with feedback and an average (rather than maximal) power constraint. 

Index Terms 

Variable-length coding, joint source-channel coding, lossy compression, single-shot method,
finite-blocklength regime, rate-distortion theory, feedback, memoryless channels, Gaussian channels,
energy-distortion tradeoff, Shannon theory.

I. Introduction 

A famous result of Shannon [1] states that feedback cannot increase the capacity of memoryless
channels. Burnashev [2] showed that feedback improves the error exponent in a variable-length
setting. Polyanskiy et al. [3] showed that allowing variable-length coding and non-vanishing
error probability $\epsilon$ boosts the $\epsilon$-capacity of the discrete memoryless channel (DMC) by a factor
of $1 - \epsilon$. Furthermore, as shown in [3], if both feedback and variable-length coding are allowed,
then the asymptotic limit is approached at a fast speed $O\left(\frac{\log \ell}{\ell}\right)$ as the average allowable
delay $\ell$ increases:^1

$$(1 - \epsilon) \log M^*(\ell, \epsilon) = \ell C + O(\log \ell) \tag{1}$$

where $M^*(\ell, \epsilon)$ is the maximum number of messages that can be distinguished with error
probability $\epsilon$ at average delay $\ell$, and $C$ is the channel capacity. This is in contrast to channel
coding at fixed blocklength $n$ where in most cases only a $O\left(\frac{1}{\sqrt{n}}\right)$ convergence rate is attainable,
even when feedback is available, see [3], [4]. Thus, variable-length coding with feedback (VLF)
not only boosts the $\epsilon$-capacity of the channel, but also markedly accelerates the speed of approach
to it. Moreover, zero-error communication is possible at an average rate arbitrarily close to
capacity via variable-length coding with feedback and termination (VLFT) codes, a class of
codes that employs a special termination symbol to signal the end of transmission, which is
always recognized error-free by the receiver [3]. As discussed in [3], the availability of zero-error
termination symbols models the common situation in which timing information is managed
by a higher layer whose reliability is much higher than that of the payload.

In [5], we treated variable-length data compression with nonzero excess distortion probability.
In particular, we showed that in fixed-to-variable-length compression of a block of $k$ i.i.d. source
outcomes, the minimum average encoded length $\ell^*(k, d, \epsilon)$ compatible with probability $\epsilon$ of
exceeding distortion threshold $d$ satisfies, under regularity assumptions,

$$\ell^*(k, d, \epsilon) = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}} + O(\log k) \tag{2}$$

where $R(d)$ and $V(d)$ are the rate-distortion and the rate-dispersion functions, and $Q$ is the
standard normal complementary cumulative distribution function. The second term in the expansion
(2) becomes more natural if one notices that for $Z \sim \mathcal{N}(0, 1)$,

$$\mathbb{E}\left[Z \cdot 1\{Z > Q^{-1}(\epsilon)\}\right] = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}}. \tag{3}$$

As elaborated in [5], the expansion (2) has an unusual feature: the asymptotic fundamental limit 
is approached from the “wrong” side, e.g. larger dispersions and shorter blocklengths reduce the 
average compression rate. 
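The identity (3) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the integration cutoff `upper` and the step count are illustrative choices, not part of the paper.

```python
import math
from statistics import NormalDist


def truncated_mean(eps: float, steps: int = 20_000, upper: float = 12.0) -> float:
    """Midpoint-rule estimate of E[Z * 1{Z > Q^{-1}(eps)}] for Z ~ N(0, 1)."""
    std = NormalDist()
    t = std.inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    h = (upper - t) / steps     # the tail beyond `upper` is numerically negligible
    total = 0.0
    for i in range(steps):
        z = t + (i + 0.5) * h
        total += z * std.pdf(z)
    return total * h


def closed_form(eps: float) -> float:
    """Right side of (3): exp(-(Q^{-1}(eps))^2 / 2) / sqrt(2 pi)."""
    t = NormalDist().inv_cdf(1.0 - eps)
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
```

Calling `truncated_mean(0.1)` and `closed_form(0.1)` should return values that agree to several decimal places.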

^1 Unless explicitly noted, all log and exp in this paper are to arbitrary matching base, which also defines units of entropy,
information density and mutual information.

In this paper, we consider variable-length transmission of a general (possibly analog) source
over a DMC with feedback, under a distortion constraint. This variable-length joint source-channel
coding (JSCC) setting can be viewed as a generalization of the setups in [3], [5], which,
as explained above, analyze the problem without source coding and channel transmission, respectively.
Related work includes an assessment of nonasymptotic fundamental limits of fixed-length
JSCC in [6]-[8], a dynamic programming formulation of zero-delay JSCC with feedback in [9],
and a practical variable-length almost lossless joint compression/transmission scheme in [10].
Various feedback coding strategies are discussed in [11]-[17]. Practical feedback communication
schemes in the literature that implement VLF include [18]-[22].

We treat several scenarios that differ in how the distortion is evaluated and whether a termination
symbol is allowed. In all cases, we analyze the average delay required to achieve the
objective. The results in Section III, where as before, $C$, $R(d)$, $V(d)$ denote the channel capacity,
and the source rate-distortion and rate-dispersion functions, respectively, are summarized as
follows:

• Under the average distortion criterion, $\mathbb{E}[d(S^k, \hat{S}^k)] \le d$, the minimal average delay $\ell^*(k, d)$
attainable via VLF codes transmitting $k$ source symbols satisfies

$$\ell^*(k, d)\, C = k R(d) + O(\log k). \tag{4}$$

• Under the excess distortion probability criterion, $\mathbb{P}[d(S^k, \hat{S}^k) > d] \le \epsilon$, the minimal average
delay attainable via VLF codes transmitting $k$ source symbols satisfies

$$\ell^*(k, d, \epsilon)\, C = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}} + O(\log k). \tag{5}$$

• Under the guaranteed maximal distortion criterion, $\mathbb{P}[d(S^k, \hat{S}^k) > d] = 0$, the minimal
average delay attainable via VLFT codes transmitting $k$ source symbols satisfies

$$\ell^*(k, d, 0)\, C = k R(d) + O(\log k). \tag{6}$$

Similar to (1), approaching the limits in (4), (5) and (6) only requires an extremely thin
feedback link, namely, the decoder sends just a single acknowledgement signal once it is
ready to decode (stop-feedback)^2. Note that (5) exhibits significant similarities with (2): the
asymptotic limit is approached from below, i.e. in contrast to the results in [6], [23], [24],
smaller blocklengths and larger source dispersions are beneficial. Note also that the first term of
the expansion in (5) can be attained with variable-length codes without feedback.

^2 Stop-feedback is not to be confused with the termination symbol, which is a special symbol that the encoder can transmit
error-free in order to tell the decoder that the transmission has ended and it is time to decode.
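To get a feel for the size of the second-order term in (5), the sketch below evaluates its two leading terms and ignores the $O(\log k)$ remainder. The example source and channel are assumptions made for illustration only: a Bernoulli($p$) source under bit error rate distortion, for which $R(d) = h(p) - h(d)$ and $V(d) = p(1-p)\log^2\frac{1-p}{p}$ when $d < p \le 1/2$, and a binary symmetric channel with capacity $\log 2 - h(\delta)$. All quantities are in nats.

```python
import math
from statistics import NormalDist


def binary_entropy(x: float) -> float:
    """Binary entropy h(x) in nats, for 0 < x < 1."""
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def vlf_delay_estimate(k: int, rate: float, dispersion: float,
                       capacity: float, eps: float) -> float:
    """Two leading terms of (5) divided by C; the O(log k) remainder is dropped."""
    first_order = (1.0 - eps) * k * rate
    if eps <= 0.0:
        # Q^{-1}(0) = +infinity, so the second-order term vanishes.
        return first_order / capacity
    t = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    second_order = math.sqrt(k * dispersion / (2.0 * math.pi)) * math.exp(-t * t / 2.0)
    return (first_order - second_order) / capacity


# Hypothetical example parameters (not taken from the paper).
p, d, crossover = 0.3, 0.1, 0.11
rate = binary_entropy(p) - binary_entropy(d)
dispersion = p * (1.0 - p) * math.log((1.0 - p) / p) ** 2
capacity = math.log(2.0) - binary_entropy(crossover)
```

For instance, `vlf_delay_estimate(1000, rate, dispersion, capacity, 0.1)` falls below the first-order value `1000 * rate / capacity`, which illustrates the approach "from below" discussed above.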

Interestingly, naive separated source/channel coding fails to attain any of the limits mentioned.
For example, it approaches the asymptotic fundamental limit from above, e.g. even the sign of the
second term in (5) is not attainable. This observation led us to believe, initially, that competitive
schemes in this setting should be of successive refinement and adaptation sort such as in [25],
[26], or dynamic programming-like as in [9], [27]. It turns out, however, that like the fixed-length
JSCC achievability schemes in [6], [7], attaining limits (4)-(6) requires a rather simple
variation on the separation architecture: one only needs to allow a variable-length interface
between the source coder and the channel coder. Note that typically, separation is understood
in the sense that the output of the source coder (compressor) is treated as pure bits, which
can be arbitrarily permuted without affecting performance of the concatenated scheme [8], [28].
Similarly, the performance of a variable-length separated scheme is insensitive to permutations
(but not additions or deletions) of the bits at the output of the source coder. These semi-joint
achievability schemes are the subject of Section II. They form the basis for the lossy joint
source-channel codes, which are the subject of Section III.

Energy-limited codes are the subject of Section IV. The optimal energy-distortion tradeoff
achievable in the transmission of a general source over the AWGN channel is studied in Section V.
In that setting, disposing of the restriction on the number of channel uses per source
sample, we limit the total available transmitter energy $E$ and we study the tradeoff between
the source dimension $k$, the total energy $E$ and the fidelity of reproduction. Related prior work
includes a study of asymptotic energy-distortion tradeoffs [29] and a nonasymptotic analysis of
energy per bit required to reliably send $k$ bits through an AWGN channel [30]. The main results
in Section V are the following:

• Under average distortion constraint, the total minimum energy required to transmit $k$ source
samples over an AWGN channel with feedback satisfies

$$E_f^*(k, d) \cdot \frac{\log e}{N_0} = k R(d) + O(\log k), \tag{7}$$

where $\frac{N_0}{2}$ is the noise power per degree of freedom.

• Under excess distortion probability constraint, the minimum energy required to transmit $k$
source samples over an AWGN channel without feedback satisfies

$$E^*(k, d, \epsilon) \cdot \frac{\log e}{N_0} = k R(d) + \sqrt{k \left(2 R(d) \log e + V(d)\right)}\, Q^{-1}(\epsilon) + O(\log k) \tag{8}$$

• Under excess distortion probability constraint, the total minimum average energy^3 required
to transmit $k$ source samples over an AWGN channel with feedback satisfies

$$E_f^*(k, d, \epsilon) \cdot \frac{\log e}{N_0} = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}} + O(\log k) \tag{9}$$

Like (5), particularizing (9) to $\epsilon = 0$ also covers the case of guaranteed distortion. The first
term in the expansion (9) can be achieved even without feedback, as long as $\epsilon > 0$ and the
power constraint is understood on the average over the codebook.

We point out the following parallels between the variable-length codes and energy-limited
codes.

• Under average distortion, in both cases the fundamental limit is approached at speed $O\left(\frac{\log k}{k}\right)$
(cf. (4), (7)).

• Allowing a non-vanishing excess-distortion probability and variable length (or variable
power) boosts the asymptotic fundamental limit by a factor of $1 - \epsilon$.

• Allowing both feedback and variable length (or power) leads to expansions (5), (9), in which
shorter blocklengths are beneficial.

• As long as feedback is available, in both variable length coding with termination and average
energy-limited coding, guaranteed distortion ($\epsilon = 0$) can be attained, even though the channel
is noisy.


II. Feedback codes for non-equiprobable messages

In this section we consider joint source-channel coding assessing reliability by the probability
that the (possibly non-equiprobable) message is reproduced correctly. These results lay the
foundation for the analysis of joint source-channel coding under distortion constraints presented
in Section III. Our key tool will be two extensions of the channel coding bounds for the DMC
with feedback from [3]. VLF and VLFT codes are formally defined as follows.


^3 The energy constraint in (8) is understood on a per-codeword basis. The energy constraint in (9) is understood as average
over the source and noise realizations.

Definition 1. A variable-length feedback code (VLF) transmitting message $W$ (taking values in
$\mathcal{W}$) over the channel with input/output alphabets $\mathcal{A}/\mathcal{B}$ is defined by:

1) A random variable $U \in \mathcal{U}$ revealed to the encoder and decoder before the start of the
transmission.

2) A sequence of encoding functions $f_n \colon \mathcal{U} \times \mathcal{W} \times \mathcal{B}^{n-1} \mapsto \mathcal{A}$, specifying the channel inputs

$$X_n = f_n(U, W, Y^{n-1}). \tag{10}$$

3) A sequence of decoding functions $g_n \colon \mathcal{U} \times \mathcal{B}^n \mapsto \mathcal{W}$, $n = 1, 2, \ldots$

4) A non-negative integer-valued random variable $\tau$, a stopping time of the filtration
$\mathcal{F}_n = \sigma\{U, Y_1, \ldots, Y_n\}$.

The final decision $\hat{W}$ is computed at the time instant $\tau$:

$$\hat{W} = g_\tau(U, Y^\tau). \tag{11}$$

The value $\mathbb{E}[\tau]$ is the average transmission length of the given code.

A very similar concept is that of a VLFT code:

Definition 2. A variable-length feedback code with termination (VLFT) transmitting $W \in \mathcal{W}$
over the channel with input/output alphabets $\mathcal{A}/\mathcal{B}$ is defined similarly to VLF
codes with the exception that condition 4) in Definition 1 is replaced by
4') A non-negative integer-valued random variable $\tau$, a stopping time of the filtration
$\mathcal{G}_n = \sigma\{W, U, Y_1, \ldots, Y_n\}$.

The idea of allowing $\tau$ to depend on the true message $W$ models the practical scenarios
(see [3]) where there is a highly reliable control layer operating in parallel with the data channel,
which notifies the decoder when it is time to make a decision.

The two special cases of the above definitions are stop-feedback and fixed-to-variable codes:
1) stop-feedback codes are a special case of VLF codes where the encoder functions $\{f_n\}_{n=1}^{\infty}$
satisfy:

$$f_n(U, W, Y^{n-1}) = f_n(U, W). \tag{12}$$

Such codes require very limited communication over feedback: only a single signal to stop
the transmission once the decoder is ready to decode.
2) fixed-to-variable codes, defined in [31], are also required to satisfy (12), while the stopping
time is^4

$$\tau = \inf\{n \ge 1 : g_n(U, Y^n) = W\}, \tag{13}$$

and therefore, such codes are zero-error VLFT codes.

For both VLF and VLFT codes, we say that a code that satisfies $\mathbb{E}[\tau] \le \ell$ and $\mathbb{P}[\hat{W} \ne W] \le \epsilon$,
when averaged over $U$, message and channel, is an $(\ell, \epsilon)$ code for source/channel $\left(W, \{P_{Y_i | X^i Y^{i-1}}\}_{i=1}^{\infty}\right)$.
The random variable $U$ serves as the common randomness shared by both transmitter and
receiver, which is used to initialize the codebook. As a consequence of Caratheodory's theorem,
the amount of this common randomness can always be reduced to just a few bits: as shown in
[3, Theorem 19], if there exists an $(\ell, \epsilon)$ code with $|\mathcal{U}| = \infty$, then there exists an $(\ell, \epsilon)$ code
with $|\mathcal{U}| \le 3$. Allowing for common randomness does not affect the asymptotic expansions, but
leads to more concise expressions for our non-asymptotic achievability bounds. Furthermore, no
common randomness is needed at all if the channel is symmetric [3, Theorem 3].

First, we quote an achievability result [3, (107)-(118)]. Let $P_Y$ be the capacity-achieving output
distribution of the DMC. Denote information density as usual:

$$\imath_{X;Y}(a; b) \triangleq \log \frac{P_{Y|X}(b|a)}{P_Y(b)}. \tag{14}$$

Theorem 1 ([3]). For every DMC with capacity $C$, any $M$ and probability of error $\epsilon$ there exists
an $(\ell, \epsilon)$ stop-feedback code for the message $W$ taking $M$ values^5 such that

$$C \ell \le \log M + \log \frac{1}{\epsilon} + a_0,$$

where

$$a_0 = \max_{x, y} \imath_{X;Y}(x; y). \tag{15}$$


We note that a similar result is also contained in many other works, starting from [2] and 
later in [32], [33]. 


^4 As explained in [31], this model encompasses fountain codes in which the decoder can get a highly reliable estimate of $\tau$
autonomously without the need for a termination symbol.

^5 Although the result in [3] is stated for average (over equiprobable messages) error probability, it in fact applies to arbitrary
distribution of messages, so $W$ need not be equiprobable. Furthermore, if we do not insist on $|\mathcal{U}| \le 3$, Theorem 1 even applies
to maximal probability of error.

Next, we tighten Theorem 1 in the case of non-equiprobable messages.

Theorem 2. For every DMC with capacity $C$ and random variable $W$ there exists an $(\ell, \epsilon)$
stop-feedback code for $W$ with

$$C \ell \le H(W) + \log \frac{1}{\epsilon} + a_0 \tag{16}$$

where $a_0$ is defined in (15).
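The gain of Theorem 2 over Theorem 1 is the replacement of $\log M$ by $H(W)$. The following sketch compares the two delay bounds for a binary symmetric channel, chosen here purely as an illustrative assumption; for that channel the capacity-achieving output distribution is uniform, so $a_0 = \log 2(1 - \delta)$ for crossover probability $\delta < 1/2$. Function names are hypothetical and all quantities are in bits.

```python
import math


def bsc_constants(crossover: float) -> tuple[float, float]:
    """Capacity C and a_0 = max information density (bits) of a BSC, 0 < crossover < 1/2."""
    h = -crossover * math.log2(crossover) - (1.0 - crossover) * math.log2(1.0 - crossover)
    return 1.0 - h, math.log2(2.0 * (1.0 - crossover))


def theorem1_delay(num_messages: int, eps: float, capacity: float, a0: float) -> float:
    """Upper bound on the average delay implied by Theorem 1."""
    return (math.log2(num_messages) + math.log2(1.0 / eps) + a0) / capacity


def theorem2_delay(pmf: list[float], eps: float, capacity: float, a0: float) -> float:
    """Upper bound on the average delay implied by (16)."""
    entropy = -sum(q * math.log2(q) for q in pmf if q > 0.0)
    return (entropy + math.log2(1.0 / eps) + a0) / capacity
```

For a skewed message distribution such as `[0.7, 0.1, 0.1, 0.1]` the Theorem 2 bound is strictly smaller; for equiprobable messages the two bounds coincide.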


Proof: If $H(W) = \infty$ then there is nothing to prove, so we assume otherwise. Denote the
information in $m$

$$\imath_W(m) = \log \frac{1}{P_W(m)} \tag{17}$$

and note that by the memorylessness assumption,

$$\imath_{X^n;Y^n}(a^n; b^n) = \sum_{i=1}^{n} \imath_{X;Y}(a_i; b_i). \tag{18}$$

In the achievability scheme of [3, Theorem 3], at time $n$ the decoder observes $Y^n$, computes $M$
information densities $\imath_{X^n;Y^n}(C_1^n; Y^n), \ldots, \imath_{X^n;Y^n}(C_M^n; Y^n)$, where $C_1, \ldots, C_M$ are the codewords,
and stops once any of them exceeds a threshold. Instead of one common threshold, we assign
lower thresholds to the more likely messages.

Code construction: We define the common randomness (revealed to the encoder and decoder
before the transmission starts) to be a random variable $U$ as follows:

$$P_U = \underbrace{P_X^{\infty} \times \ldots \times P_X^{\infty}}_{|\mathcal{W}|} \tag{19}$$

where $P_X^{\infty}$ consists of i.i.d. copies drawn from (any) capacity-achieving input distribution. A
realization of $U$ defines $|\mathcal{W}|$ infinite dimensional vectors $C_m^{\infty} \in \mathcal{A}^{\infty}$, $m \in \mathcal{W}$. Having observed
that the value of $W$ equals $m_0 \in \mathcal{W}$, the encoder transmits channel symbols $X_n$ as follows:

$$X_n = f_n(m_0) = (C_{m_0}^{\infty})_n. \tag{20}$$

At time $n$, the decoder computes the values

$$I_n(m) = \imath_{X^n;Y^n}(f^n(m); Y^n) - \imath_W(m). \tag{21}$$

The decoder defines the stopping times:

$$\tau_m = \inf\{n \ge 0 : I_n(m) \ge \gamma\} \tag{22}$$

where $\gamma > 0$ is an arbitrary constant. The final decision $\hat{W}$ is made by the decoder at the
stopping time $\tau^*$:

$$\tau^* = \min_{m \in \mathcal{W}} \tau_m \tag{23}$$

$$\hat{W} = g(Y^{\tau^*}) = \arg\min_{m \in \mathcal{W}} \tau_m, \tag{24}$$

where the tie-breaking rule is immaterial.

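Analysis: We claim that, averaged over $U$, we have:

$$\mathbb{P}[\hat{W} \ne W] \le \exp(-\gamma) \tag{25}$$

$$C\, \mathbb{E}[\tau^*] \le H(W) + \gamma + a_0. \tag{26}$$

Abbreviate the stopping time of the true message as

$$\tau = \tau_W. \tag{27}$$

The union bound results in, cf. [3, Theorem 3]:

$$\mathbb{P}[\hat{W} \ne W \mid W = m_0] \le \sum_{m \in \mathcal{W} \setminus \{m_0\}} \mathbb{P}[\tau_m \le \tau \mid W = m_0] \tag{28}$$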

(29) 

Due to memorylessness of the channel, $\imath_{X^n;Y^n}(X^n; Y^n) - nC$ is a martingale with jump size
bounded by $a_0$ (defined in (15)). For each $j \in \mathbb{N}$, $\min\{\tau, j\}$ is a bounded stopping time, so by
Doob's optional stopping theorem ([34, Theorem 10.10]) we have

$$C\, \mathbb{E}[\min\{\tau, j\}] - H(W) = \mathbb{E}\left[I_{\min\{\tau, j\}}(W)\right] \le \gamma + a_0. \tag{30}$$

By the monotone convergence theorem, it follows from (30) that

$$C\, \mathbb{E}[\tau] - H(W) \le \gamma + a_0. \tag{31}$$

Therefore, the stopping time $\tau$ has bounded expectation, and since the martingale $\imath_{X^n;Y^n}(X^n; Y^n) - nC$
also has bounded increments, Doob's optional stopping theorem applies to conclude

$$\mathbb{E}\left[\imath_{X^\tau;Y^\tau}(X^\tau; Y^\tau)\right] = C\, \mathbb{E}[\tau]. \tag{32}$$

Next, we have, by a change of measure argument, for every $m \ne m_0$:

$$\mathbb{P}[\tau_m = n \mid W = m_0] \le \exp\{-\imath_W(m) - \gamma\}\, \mathbb{P}[\tau_m = n \mid W = m]. \tag{33}$$

Consequently, using (33), we have

$$\mathbb{P}[\tau_m \le \tau \mid W = m_0] \le \mathbb{P}[\tau_m < \infty \mid W = m_0] \tag{34}$$

$$= \sum_{n=0}^{\infty} \mathbb{P}[\tau_m = n \mid W = m_0] \tag{35}$$

$$\le \exp\{-\imath_W(m) - \gamma\}, \tag{36}$$

where we used that $\mathbb{P}[\tau < \infty] = \mathbb{P}[\tau_m < \infty \mid W = m] = 1$, which in turn follows from (31).
Summing (36) over all $m \ne m_0$ and using (28) we get (25). Note that the reasoning in (33)-(36)
generalizes that in [3, (111)-(118)] to nonequiprobable messages.

The estimate of average length in (26) follows from $\tau^* \le \tau$ and (30). The result (16) follows
by taking $\gamma = \log \frac{1}{\epsilon}$. ■
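The stopping rule of this proof is easy to simulate. The sketch below is a Monte Carlo illustration under assumptions that are not part of the paper: a binary symmetric channel with crossover probability `crossover`, codeword symbols drawn i.i.d. equiprobable (the capacity-achieving input distribution for that channel) and generated on the fly, and logarithms to base 2. The function name and its parameters are hypothetical.

```python
import math
import random


def simulate_stop_feedback(p_w, crossover, eps, trials=4000, seed=1):
    """Estimate the average delay and error rate of the threshold decoder (21)-(24)."""
    rng = random.Random(seed)
    gamma = math.log2(1.0 / eps)                    # threshold, gamma = log(1/eps)
    d_match = math.log2(2.0 * (1.0 - crossover))    # information density when c == y
    d_miss = math.log2(2.0 * crossover)             # information density when c != y
    prior_info = [math.log2(1.0 / q) for q in p_w]  # i_W(m)
    messages = range(len(p_w))
    total_delay, errors = 0, 0
    for _ in range(trials):
        w = rng.choices(messages, weights=p_w)[0]
        metric = [-i for i in prior_info]           # I_0(m) = -i_W(m)
        n = 0
        while max(metric) < gamma:                  # nobody has crossed the threshold yet
            n += 1
            symbols = [rng.getrandbits(1) for _ in messages]  # n-th symbol of each codeword
            y = symbols[w] ^ (rng.random() < crossover)       # BSC output for the true message
            for m in messages:
                metric[m] += d_match if symbols[m] == y else d_miss
        decision = max(messages, key=lambda m: metric[m])     # a message that crossed first
        total_delay += n
        errors += decision != w
    return total_delay / trials, errors / trials
```

With, say, `p_w = [0.5, 0.25, 0.125, 0.125]`, `crossover = 0.1` and `eps = 0.05`, the empirical delay can be compared against $(H(W) + \gamma + a_0)/C$ from (26) and the empirical error rate against $\exp(-\gamma) = \epsilon$ from (25). Since both bounds hold only in expectation, small Monte Carlo fluctuations are to be expected.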

Remark 1. A slightly less sharp bound could also be derived via a variable-length separated
scheme: compress $W$ losslessly into a variable-length string $\{0,1\}^*$ with average length less
than $H(W)$, cf. [35], then send the length via $O(\log H(W))$ channel symbols and very small
probability of error and finally send the data bits.

Next, we extend the zero-error bound in [3, Theorem 11] to the case of non-equiprobable
messages:

Theorem 3. For every DMC with capacity $C$ there exists a constant $a_1$ such that for every
discrete random variable $W$ there exists an $(\ell, 0)$ VLFT code with

$$C \ell \le H(W) + a_1. \tag{37}$$

Proof: Without loss of generality, we assume that $H(W) < \infty$ and $W$ takes values in
positive integers. The codebook is a countable collection of infinite strings $C_m^{\infty}$, $m = 1, 2, \ldots$
Given the codebook and the realization of $W = m_0$, the encoder sends

$$X_n = f_n(m_0) = (C_{m_0}^{\infty})_n. \tag{38}$$

The decoder outputs $m$ error-free at the stopping time $\tau^*$ given by:

$$\tau^* = \inf\left\{n : I_n(m) > \max_{j \ne m} I_n(j)\right\}, \tag{39}$$

where

$$I_n(m) = \imath_{X^n;Y^n}(C_m^n; Y^n) - \imath_W(m). \tag{40}$$

According to (39), if the true message is $m$ the transmission stops at the first instant $n$ when
$I_n(m)$ exceeds all $I_n(j)$, $j \ne m$. Note that $\tau^*$ depends on the transmitted message $m$, as permitted
by the paradigm of VLFT codes.

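Analysis:

$$\mathbb{P}[\tau^* > n \mid W = m] \le \mathbb{P}\left[\bigcup_{j \ne m} \{I_n(j) \ge I_n(m)\} \,\middle|\, W = m\right]. \tag{41}$$

Applying the random coding argument, we now assume that the codebook strings $C_1^{\infty}, C_2^{\infty}, \ldots$
are drawn i.i.d. from $P_X^{\infty} = P_X \times P_X \times \ldots$, where $P_X$ is the capacity-achieving channel input
distribution. Denoting by $\bar{X}^n$ an independent copy of $X^n$ and taking the expectation of the right
side of (41) with respect to the codebook, we have

$$\mathbb{P}\left[\bigcup_{j \ne m} \left\{\frac{P_W(j)\, P_{Y^n|X^n}(Y^n | C_j^n)}{P_W(m)\, P_{Y^n|X^n}(Y^n | C_m^n)} \ge 1\right\} \,\middle|\, W = m\right]$$

$$\le \mathbb{E}\left[\min\left\{1, \sum_{j=1}^{\infty} \mathbb{P}\left[\frac{P_W(j)\, P_{Y^n|X^n}(Y^n | \bar{X}^n)}{P_W(m)\, P_{Y^n|X^n}(Y^n | X^n)} \ge 1 \,\middle|\, X^n, Y^n\right]\right\}\right] \tag{42}$$

$$\le \mathbb{E}\left[\min\left\{1, \sum_{j=1}^{\infty} \frac{P_W(j)\, \mathbb{E}\left[P_{Y^n|X^n}(Y^n | \bar{X}^n) \mid Y^n\right]}{P_W(m)\, P_{Y^n|X^n}(Y^n | X^n)}\right\}\right] \tag{43}$$

$$= \mathbb{E}\left[\exp\left(-\left|\imath_{X^n;Y^n}(X^n; Y^n) - \imath_W(m)\right|^+\right)\right], \tag{44}$$

where we used the union bound and $\min\{1, a\} = \exp\left(-\left|\log \frac{1}{a}\right|^+\right)$. Applying (44) we get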

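$$\mathbb{E}[\tau^*] = \sum_{n=0}^{\infty} \mathbb{P}[\tau^* > n] \tag{45}$$

$$\le \sum_{n=0}^{\infty} \mathbb{E}\left[\exp\left(-\left|\imath_{X^n;Y^n}(X^n; Y^n) - \imath_W(W)\right|^+\right)\right]. \tag{46}$$

Finally, (46) implies (37) by applying the result [3, (162)-(179)]:

$$\sum_{n=0}^{\infty} \mathbb{E}\left[\exp\left(-\left|\imath_{X^n;Y^n}(X^n; Y^n) - \gamma\right|^+\right)\right] \le \frac{\gamma}{C} + a_1. \tag{47}$$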
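One way to read this last step is sketched below. It assumes that the bound quoted from [3] is applied conditionally on $W = m$ with threshold $\gamma = \imath_W(m)$, and that it has the form $\gamma / C$ plus a constant; both are assumptions of this sketch.

```latex
\begin{align*}
\mathbb{E}[\tau^* \mid W = m]
  &\le \sum_{n=0}^{\infty} \mathbb{E}\left[\exp\left(-\left|\imath_{X^n;Y^n}(X^n;Y^n) - \imath_W(m)\right|^+\right)\right]
   \le \frac{\imath_W(m)}{C} + a_1, \\
\mathbb{E}[\tau^*]
  &= \sum_{m} P_W(m)\, \mathbb{E}[\tau^* \mid W = m]
   \le \frac{\mathbb{E}[\imath_W(W)]}{C} + a_1
   = \frac{H(W)}{C} + a_1 .
\end{align*}
```

Multiplying by $C$ gives $C\, \mathbb{E}[\tau^*] \le H(W) + C a_1$, which has the form (37) after renaming the constant.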
III. Asymptotic expansions of the rate-distortion tradeoff
A. Definitions

We move from the setup of Section II where a discrete message is transmitted over the channel
with feedback to a more general scenario, in which a, possibly analog, signal is transmitted over
a channel with feedback, under a fidelity constraint. We will consider the following scenarios:

1) excess distortion probability: A VLF code transmitting memoryless source $S^k \in \mathcal{S}^k$ with
reproduction alphabet $\hat{\mathcal{S}}^k$ and separable distortion measure $d \colon \mathcal{S}^k \times \hat{\mathcal{S}}^k \mapsto [0, +\infty]$ is called
a $(k, \ell, d, \epsilon)$ excess-distortion code if the decoding time and the distortion satisfy

$$\mathbb{E}[\tau] \le \ell \tag{48}$$

$$\mathbb{P}[d(S^k, \hat{S}^k) > d] \le \epsilon \tag{49}$$

The corresponding fundamental limit is

$$\ell^*(k, d, \epsilon) = \inf\{\ell : \exists \text{ a } (k, \ell, d, \epsilon) \text{ VLF code}\}. \tag{50}$$

2) average distortion: A VLF code satisfying, instead of (49), an average constraint

$$\mathbb{E}[d(S^k, \hat{S}^k)] \le d \tag{51}$$

is called a $(k, \ell, d)$ average-distortion code. The corresponding fundamental limit is

$$\ell^*(k, d) = \inf\{\ell : \exists \text{ a } (k, \ell, d) \text{ VLF code}\}. \tag{52}$$

3) guaranteed distortion: A VLFT code transmitting memoryless source $S^k \in \mathcal{S}^k$ with
reproduction alphabet $\hat{\mathcal{S}}^k$ and separable distortion metric $d$ is called a $(k, \ell, d, 0)$ guaranteed-distortion
code if it achieves $\epsilon = 0$ in (49). The corresponding fundamental limit is

$$\ell^*(k, d, 0) = \inf\{\ell : \exists \text{ a } (k, \ell, d, 0) \text{ VLFT code}\}. \tag{53}$$

We will use the following notation for the various minimizations of information measures:

$$R_S(d) = \min_{P_{Z|S} \colon\, \mathbb{E}[d(S, Z)] \le d} I(S; Z) \tag{54}$$

$$\mathbb{R}_S(d, \epsilon) = \min_{P_{Z|S} \colon\, \mathbb{P}[d(S, Z) > d] \le \epsilon} I(S; Z) \tag{55}$$

$$H_{d,\epsilon}(S) = \min_{c \colon \mathcal{S} \mapsto \hat{\mathcal{S}} \colon\, \mathbb{P}[d(S, c(S)) > d] \le \epsilon} H(c(S)). \tag{56}$$

The quantity in (56) is referred to as the $(d, \epsilon)$-entropy of the source $S$ [36]. The $(d, 0)$-entropy
is also known as epsilon-entropy [36]:^6

$$H_{d,0}(S) = \min_{c \colon \mathcal{S} \mapsto \hat{\mathcal{S}} \colon\, d(S, c(S)) \le d \text{ a.s.}} H(c(S)). \tag{57}$$

Provided that the infimum in (54) is achieved by some transition probability kernel $P_{Z^*|S}$, the
$d$-tilted information in $s \in \mathcal{S}$ is defined as [24]

$$\jmath_S(s, d) = -\log \mathbb{E}\left[\exp\left(-\lambda^* d(s, Z^*) + \lambda^* d\right)\right] \tag{58}$$

where^7

$$\lambda^* = -R_S'(d). \tag{59}$$

In almost-lossless compression, $Z^* = S$ and

$$\jmath_S(s, d) = \imath_S(s). \tag{60}$$

B. Regularity assumptions on the source

We assume that the source, together with its distortion measure, satisfies the following assumptions:

A1 The source $\{S_i\}$ is stationary and memoryless, $P_{S^k} = P_S \times \ldots \times P_S$.

A2 The distortion measure is separable, $d(s^k, z^k) = \frac{1}{k} \sum_{i=1}^{k} d(s_i, z_i)$.

A3 The distortion level satisfies $d_{\min} < d < d_{\max}$, where $d_{\min}$ is the infimum of values at which
the minimal mutual information quantity $R_S(d)$ is finite, and $d_{\max} = \inf_{z \in \hat{\mathcal{S}}} \mathbb{E}[d(S, z)]$, where
the expectation is with respect to the unconditional distribution of $S$.

A4 The rate-distortion function is achieved by a unique $P_{Z^*|S}$: $R(d) = I(S; Z^*)$.

A5 $\mathbb{E}[d^{12}(S, Z^*)] < \infty$ where the expectation is with respect to $P_S \times P_{Z^*}$.

The rate-dispersion function of the source satisfying assumptions A1-A5 is given by [24]

$$V(d) = \mathrm{Var}\left[\jmath_S(S, d)\right]. \tag{61}$$

We showed in [5] that under assumptions A1-A5 for all $0 \le \epsilon \le 1$

$$\left.\begin{aligned} \mathbb{R}_{S^k}(d, \epsilon) \\ H_{d,\epsilon}(S^k) \end{aligned}\right\} = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}} + O(\log k). \tag{62}$$

^6 N.B. in that terminology "epsilon" corresponds to $d$, not $\epsilon$.

^7 Note that the existence of $P_{Z^*|S}$ guarantees the differentiability of $R_S(d)$.
C. Average distortion

Theorem 4. Under assumptions A1-A5 we have

$$C\, \ell^*(k, d) = k R(d) + O(\log k). \tag{63}$$

Proof:

Achievability: fix $1 < p < 12$, $\epsilon > 0$, source codebook $c_1, \ldots, c_M$ and consider a separated
scheme $S^k - f(S^k) - X - Y - \hat{W} - c(\hat{W})$ such that $\mathbb{P}[f(S^k) \ne \hat{W}] \le \epsilon$ and

$$f(s^k) = \arg\min_{m} d(s^k, c_m), \tag{64}$$

$$c(m) = c_m. \tag{65}$$

The average distortion is bounded by

$$\mathbb{E}[d(S^k, c(\hat{W}))] \le \mathbb{E}[d(S^k, c(W))] + \mathbb{E}\left[d(S^k, c(\hat{W}))\, 1\{\hat{W} \ne W\}\right] \tag{66}$$

$$\le \mathbb{E}\left[\min_{m} d(S^k, c_m)\right] + \left(\mathbb{E}\left[d^p(S^k, c(\hat{W}))\right]\right)^{\frac{1}{p}} \epsilon^{1 - \frac{1}{p}}, \tag{67}$$

where $W = f(S^k)$ and (67) holds by Hölder's inequality. Taking the expectation of both sides over $c_1, \ldots, c_M$
drawn i.i.d. from $P_{Z^{k*}}$ we conclude, via a random coding argument, that there exists a separate
source/channel code with average distortion bounded by

$$d \le \bar{d} + \left(\mathbb{E}\left[d^p(S^k, Z_1^k)\right]\right)^{\frac{1}{p}} \epsilon^{1 - \frac{1}{p}}, \tag{68}$$

where we denoted

$$\bar{d} = \mathbb{E}\left[\min_{m} d(S^k, Z_m^k)\right], \tag{69}$$

and $P_{S^k Z_1^k \ldots Z_M^k} = P_{S^k} \times P_{Z^{k*}} \times \ldots \times P_{Z^{k*}}$. The work of Pilc [37] (finite alphabet) and Yang and Zhang
[38] (abstract alphabet) shows that under assumptions A1-A5,^8

$$\log M = k R(\bar{d}) + O(\log k). \tag{70}$$

Letting $\epsilon = k^{-\frac{p}{p-1}}$, we conclude by assumption A5 that the second term in (68) is bounded by
$O\left(\frac{1}{k}\right)$, i.e. $d \le \bar{d} + O\left(\frac{1}{k}\right)$. Finally, by Theorem 1, $C \ell \le \log M + O(\log k)$, and the '≤' direction
in (63) follows by the differentiability of $R(d)$ in the region of assumption A3.
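The choice of $\epsilon$ in the last step can be unpacked as follows. This is a worked restatement under the assumption that the exponent of $\epsilon$ in the Hölder term is $1 - \frac{1}{p}$ and that $\epsilon = k^{-\frac{p}{p-1}}$:

```latex
\begin{align*}
\epsilon = k^{-\frac{p}{p-1}}
  \;\Longrightarrow\;
  \epsilon^{1-\frac{1}{p}} = k^{-\frac{p}{p-1}\cdot\frac{p-1}{p}} = \frac{1}{k},
  \qquad
  \log\frac{1}{\epsilon} = \frac{p}{p-1}\,\log k = O(\log k).
\end{align*}
```

The first relation makes the excess term in the distortion bound decay as $1/k$, provided the $p$-th moment is bounded; the second shows that the $\log \frac{1}{\epsilon}$ penalty in Theorem 1 only contributes to the $O(\log k)$ remainder.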


^8 In particular, the bounded twelfth moment in A5 is required for the applicability of the result of Yang and Zhang [38].

Converse: By the data-processing inequality and [2, Lemma 1-2] we have

$$k R(d) \le \ell C \tag{71}$$

for any $(k, \ell, d)$ VLF code. ■

D. Excess distortion probability 

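Theorem 5. Under assumptions A1-A5 and any $\epsilon > 0$ we have

$$\ell^*(k, d, \epsilon)\, C = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\, e^{-\frac{(Q^{-1}(\epsilon))^2}{2}} + O(\log k) \tag{72}$$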


Proof: Achievability: Pair a lossy compressor $S^k \to W$ with excess-distortion probability
$\epsilon' = \epsilon - \frac{1}{k}$ and $H(W) = H_{d,\epsilon'}(S^k)$^9 with a VLF code from Theorem 2 transmitting $W$ with
probability of error $\frac{1}{k}$. Apply (62) to (16)^10.

Converse: Apply the data-processing inequality and [2, Lemma 1-2] to get:

$$\ell C \ge \mathbb{R}_{S^k}(d, \epsilon) \tag{73}$$

for every $(k, \ell, d, \epsilon)$ VLF code. ■

E. Guaranteed distortion 

Theorem 6. Under assumptions A1-A5, we have

$$\ell^*(k, d, 0)\, C = k R(d) + O(\log k) \tag{74}$$

Proof: For the achievability we note that the estimate of $H_{d,\epsilon}(S^k)$ in (62) applies with
$\epsilon = 0$ and thus

$$H_{d,0}(S^k) = k R(d) + O(\log k). \tag{75}$$

^9 Although the optimal mapping $c^*$ that achieves $(d, \epsilon)$-entropy is not known in general, the existence of good approximations
satisfying the constraint in (56) that approach $H_{d,\epsilon}(S^k)$ to within $\log_2 H_{d,\epsilon}(S^k)$ bits is guaranteed by a random coding argument,
see [5].

^10 Note that (62) also holds if $\epsilon$ in the left side is replaced by $\epsilon + O\left(\frac{1}{k}\right)$.
Then, we can pair the mapping achieving $H_{d,0}(S^k)$ with the zero-error VLFT code from Theorem 3.

Conversely, repeating the argument of [3, Theorem 4], with the replacement of the right side
of [3, (67)] by $\mathbb{R}_S(d, \epsilon)$ we conclude that any $(\ell, d, \epsilon)$ VLFT code must satisfy

$$\mathbb{R}_S(d, \epsilon) \le C \ell + \log(\ell + 1) + \log e. \tag{76}$$


F. Discussion 


We make several remarks regarding the rate-distortion tradeoff in all three settings considered
above:

1) The case $d = d_{\min}$ is excluded by the assumptions of Theorems 4-6. However, in the
important special case of a distortion measure that satisfies

$$d(a, b) \begin{cases} = d_{\min}, & a = b \\ > d_{\min}, & a \ne b \end{cases} \tag{77}$$

$d = d_{\min}$ corresponds to almost-lossless transmission, and both Theorems 4 and 5 apply
with $R(d)$ and $V(d)$ equal to the entropy and the varentropy of the source, respectively, as
long as the source is stationary and memoryless and the third moment of $\imath_S(S)$ is finite.

2) For almost-lossless transmission of finite alphabet sources, the asymptotic expansion (72) 
can be achieved by reliably (i.e. with probability of error ~ sending through the channel 
the type of the source outcome first, and then reliably sending each message whose type is 
one the most likely types with total mass 1 — e. 

3) Even if the channel is not symmetric, the asymptotic expansions in Theorems 4-6 can be 

achieved without common randomness U, by using constant composition channel code¬ 
books. For instance, consider the scheme in Theorem 2 with drawn from the distri¬ 
bution equiprobable over the capacity-achieving type. Since E = E [r] a.s., almost 

every such codebook attains the bound in (30) up to logarithmic terms, resulting in a 
deterministic construction attaining (72). 

4) Stop-feedback codes are remarkably powerful at finite blocklength; indeed, up to the terms
of order $O(\log k)$, they attain the fundamental limits in the settings of Theorems 4, 5 and
6. As the converse parts of Theorems 4, 5 and 6 demonstrate, relying on feedback more
heavily can only bring in an improvement of order at most $O(\log k)$.

5) Note that (72) is achieved by a stop-feedback code. We can further show that even without
any feedback one can still achieve the optimal first order performance

$$\ell C \le (1 - \epsilon) k R(d) + O\left(\sqrt{k \log k}\right), \tag{78}$$

provided variable-length channel coding is allowed. Indeed, one can first use the variable-length
excess-distortion compressor from [5] on $S^k$ to get a binary string of average length
$(1 - \epsilon) k R(d) + O(\sqrt{k})$, see (2). Then, truncating the length at $k^2$ and transmitting $2 \log k$ data
bits with reliability we can reliably inform the encoder about the total number of data 
bits b to be sent next. We may then use a capacity-achieving code of length ^ + OiVMRib) 
to send the data bits with reliability ^ [23]. 

6) The naive separation scheme, i.e. a fixed-length source code followed by a channel code,
achieves at most:

$$\ell C \ge (1 - \epsilon) k R(d) + a \sqrt{k \log k}, \quad a > 0. \tag{79}$$

Indeed, according to Theorem 1, the number of messages $M$ that can be transmitted via a
VLF code with error probability $\eta$ satisfies

$$\log M \ge \frac{\ell C}{1 - \eta} + O(\log \ell). \tag{80}$$

On the other hand, the number of codewords of a source code with probability of exceeding
distortion $d$ no greater than $\zeta$ satisfies [24]

$$\log M \le k R(d) + \sqrt{k V(d)}\, Q^{-1}(\zeta) + O(1). \tag{81}$$

Optimizing over $\eta + \zeta \le \epsilon$ yields (79).

7) The semi-joint separated schemes that attain (72) contain a vital ingredient missing from
naive separated schemes: namely, the channel code employs unequal error protection. Consequently,
the more likely source codewords are decoded with higher reliability, resulting in
massive improvement at finite blocklength evidenced by (72). Unequal error protection can
be achieved either via a maximum-a-posteriori-like decoder of Theorem 2 or the variable-length
separation interface of Remark 1.

8) The Schalkwijk-Bluestein [22] (see also [39]) elegant linear feedback scheme for the transmission
of a single Gaussian sample $S \sim \mathcal{N}(0, \sigma^2)$ over the AWGN channel achieves the
mean-square error $\frac{\sigma^2}{(1 + P)^n}$ after $n$ channel uses, where $P$ is the average transmit SNR. In
other words, the minimum delay in transmitting a Gaussian sample over a Gaussian channel
with feedback is given by

$$\ell^*(1, d) = \frac{R(d)}{C} \tag{82}$$

as long as $\frac{R(d)}{C}$ is integer.^11 Note that (82) is achieved with fixed, not variable length,
and average, not maximal, power constraint. If there are $k$ Gaussian samples to transmit,
repeating the scheme for each of the samples achieves

$$\ell^*(k, d) = k \frac{R(d)}{C}, \tag{83}$$

which implies, in particular, that in general our estimate of $O(\log k)$ in (63) is too conservative.
Beyond Gaussian sources and channels, a sufficient condition for a fixed-length
JSCC feedback scheme to achieve (83) is provided in [16].

9) The Schalkwijk-Bluestein scheme uses instantaneous feedback and has notoriously resisted 
generalization beyond Gaussian channels, which limits the applicability of the scheme. In 
contrast, the simple separated scheme in Theorem 4 uses only stop-feedback and applies 
to arbitrary sources and channels. 
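The per-use error reduction behind (82) can be illustrated with a short recursion. The sketch below is not the construction of [22] itself; it is a generic linear MMSE feedback recursion under assumed conventions: unit-variance channel noise, transmit power equal to the SNR $P$ in every channel use, and noiseless instantaneous feedback, so that the transmitter knows the receiver's current estimation error. Function names are illustrative.

```python
import math


def linear_feedback_mse(sigma2: float, snr: float, n_uses: int) -> float:
    """Error variance after n_uses rounds of sending the scaled estimation error."""
    mse = sigma2
    for _ in range(n_uses):
        gain = math.sqrt(snr / mse)  # scale the current error to power `snr`
        # Channel: Y = gain * e + N with N ~ N(0, 1).
        # LMMSE residual variance: Var(e) - Cov(e, Y)^2 / Var(Y).
        mse = mse - (gain * mse) ** 2 / (gain ** 2 * mse + 1.0)
    return mse


def delay_for_distortion(sigma2: float, d: float, snr: float) -> float:
    """R(d) / C for a Gaussian source under MSE distortion and an AWGN channel."""
    rate = 0.5 * math.log(sigma2 / d)
    capacity = 0.5 * math.log(1.0 + snr)
    return rate / capacity
```

With `sigma2 = 1` and `snr = 3`, four channel uses reduce the error variance to $1/(1+3)^4 = 1/256$, and `delay_for_distortion(1.0, 1 / 256, 3.0)` evaluates to 4 up to rounding, in line with (82).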


IV. Energy-limited feedback codes for non-equiprobable messages 

In this section, we study the transmission of a message over an AWGN channel under an 
energy constraint. We would like to know how much information can be pushed through the 
channel, if a total of E units of energy is available to accomplish the task. Formally, the codes 
studied in this section are defined as follows. 


Definition 3. An energy-limited code without feedback for the transmission of a random variable
$W$ taking values in $\mathcal{W}$ over an AWGN channel is defined by:

1) A sequence of encoders $f_n\colon \mathcal{W} \mapsto \mathcal{A}$, specifying the channel inputs

$$ X_n = f_n(W) \tag{84} $$

satisfying

$$ \mathbb{P}\left[ \sum_{i=1}^{\infty} X_i^2 \le E \right] = 1 \tag{85} $$

2) A decoder $g\colon \mathcal{B}^\infty \mapsto \mathcal{W}$.

$^{11}$ If $R(d) = C$, no feedback is needed.


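Definition 4. An energy-limited feedback code for the transmission of a random variable $W$
taking values in $\mathcal{W}$ over an AWGN channel is defined by:

1) A sequence of encoders $f_n\colon \mathcal{W} \times \mathcal{B}^{n-1} \mapsto \mathcal{A}$, defining the channel inputs

$$ X_n = f_n(W, Y^{n-1}) \tag{86} $$

satisfying

$$ \sum_{i=1}^{\infty} \mathbb{E}\left[X_i^2\right] \le E \tag{87} $$

2) A decoder $g\colon \mathcal{B}^\infty \mapsto \mathcal{W}$.

An $(E, \epsilon)$ code for the transmission of random variable $W$ over the Gaussian channel is a
code with energy bounded by $E$ and $\mathbb{P}[W \ne \hat{W}] \le \epsilon$.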

Definitions 3-4 do not impose any restrictions on the number of degrees of freedom $n$,
restricting instead the total available energy. The problem of transmitting a message with minimum
energy was posed by Shannon [40], who showed that $E_b$, the minimum energy per information
bit compatible with vanishing block error probability, converges to $N_0 \log_e 2$ as the number of
information bits goes to infinity, where $\frac{N_0}{2}$ is the noise power per degree of freedom. Recently,
Polyanskiy et al. [30, Theorem 7] showed a dynamic programming algorithm for the error-free
transmission of a single bit over an AWGN channel with feedback that attains exactly Shannon's
optimal energy per bit tradeoff

$$ E_b = N_0 \log_e 2. \tag{88} $$

The next non-asymptotic achievability result leverages that algorithm to transmit error-free a
binary representation of a random variable over the AWGN channel by means of a
variable-length separate compression/transmission scheme.

Theorem 7. There exists a zero-error feedback code for the transmission of a random variable
$W$ over the AWGN channel with energy

$$ \frac{E}{N_0} \log e < H(W) + 1. \tag{89} $$

Conversely, any $(E, 0)$-feedback code must satisfy

$$ H(W) \le \frac{E}{N_0} \log e. \tag{90} $$
Bit number      Sequence of time slots
1               1  2  4  7  ...
2               3  5  8  ...
3               6  9  ...
...
$\ell(W)$       ...

Fig. 1. Illustration of the diagonal numbering of channel uses in Theorem 7.


Proof: The encoder converts the source into a variable-length string using the Huffman
code, so that the codebook is prefix-free and the expectation of the encoded length $\ell(W)$ is
bounded as

$$ \mathbb{E}[\ell(W)] < H(W) + 1. \tag{91} $$

Next, each bit (out of $\ell(W)$) is transmitted at the optimal energy per bit tradeoff $N_0 \log_e 2$ using
the zero-error feedback scheme in [30, Theorem 7]. Transmissions corresponding to different bits
are interleaved diagonally (see Fig. 1): the first bit is transmitted in time slots 1, 2, 4, 7, 11, ..., the
second one in 3, 5, 8, 12, ..., and so on. The channel encoder is silent at those indices allocated
to source bits $\ell(W) + 1, \ell(W) + 2, \ldots$ For example, if the codeword has length 2, nothing is
transmitted in time slots 6, 9, 13, .... The receiver decodes the first transmitted bit focusing on
the time slots 1, 2, 4, 7, 11, ... It proceeds successively with the second bit, etc., until it forms
a codeword of the Huffman code, at which point it halts. Thus, it does not need to examine
the outputs of the time slots corresponding to information bits that were not transmitted, and in
which the encoder was silent.
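
The diagonal numbering of channel uses described above can be sketched in a few lines of Python. This is only an illustration of the slot assignment: the closed-form expression and the function names are our own (hypothetical), chosen to reproduce the slot sequences quoted in the text, and they say nothing about the per-bit feedback scheme of [30, Theorem 7] itself.

```python
def slot(bit: int, use: int) -> int:
    """Time slot of the `use`-th channel use devoted to source bit `bit` (both 1-based).

    The pair (bit, use) lies on anti-diagonal number bit + use - 1; earlier
    diagonals occupy diagonal * (diagonal - 1) / 2 slots, and within a diagonal
    the slots are ordered by bit number.
    """
    diagonal = bit + use - 1
    return diagonal * (diagonal - 1) // 2 + bit

def slots_for_bit(bit: int, count: int) -> list:
    """First `count` time slots allocated to source bit `bit`."""
    return [slot(bit, use) for use in range(1, count + 1)]

print(slots_for_bit(1, 5))  # slots of the first bit
print(slots_for_bit(2, 4))  # slots of the second bit
print(slots_for_bit(3, 3))  # slots of the third bit (silent if the codeword has length 2)
```

Every positive integer is assigned to exactly one (bit, use) pair, so the interleaved transmissions never collide.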

Since the scheme spends $N_0 \log_e 2$ energy per bit, the total energy to transmit the codeword
representing $W$ is

$$ \ell(W) N_0 \log_e 2. \tag{92} $$

Taking the expectation of (92) over $W$ and applying (91), (89) follows.

In the converse direction, due to the zero-error requirement and data processing,
$H(W) = I(W; g(Y^\infty)) \le I(W; Y^\infty)$, which in turn is bounded by the right side of (90) [40]. ■
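
As a numerical sanity check of the bounds (89) and (90), the sketch below builds a binary Huffman code for a small example distribution and evaluates the average energy of the separated scheme, assuming (as in the proof) that every transmitted bit costs exactly $N_0 \log_e 2$. The distribution, the value of $N_0$ and the helper names are illustrative assumptions.

```python
import heapq
import math

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)  # unique second key, so symbol lists are never compared
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, tie_breaker, s1 + s2))
        tie_breaker += 1
    return lengths

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

probs = [0.4, 0.3, 0.2, 0.1]  # example distribution (assumption)
n0 = 1.0                      # noise level N_0 (assumption)
lengths = huffman_lengths(probs)
mean_length = sum(p * l for p, l in zip(probs, lengths))
energy = mean_length * n0 * math.log(2)           # expectation of (92)
normalized = energy / n0 * math.log2(math.e)      # (E / N_0) log e, in bits
print(lengths, entropy_bits(probs), normalized)
```

The normalized energy falls between $H(W)$ and $H(W) + 1$, as (90) and (89) require.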

Our next achievability result studies the performance of a variable-length separated scheme. 
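Theorem 8. Fix positive $E_1$ and $E_2$ such that

$$ E_1 + E_2 \le E. \tag{93} $$

Denote

$$ \epsilon(E, m) = 1 - \frac{1}{\sqrt{\pi N_0}} \int_{-\infty}^{\infty} \left[ 1 - Q\left( \sqrt{\frac{2}{N_0}}\, x \right) \right]^{m-1} e^{-\frac{(x - \sqrt{E})^2}{N_0}} \, dx. \tag{94} $$

Assume that $W$ takes values in $\{1, 2, \ldots, M\}$. There exists an $(E, \epsilon)$ non-feedback code for the
transmission of the random variable $W$ over an AWGN channel such that

$$ \epsilon \le \mathbb{E}\left[\epsilon(E_1, W)\right] + \epsilon\left(E_2, \lfloor \log_2 M \rfloor + 1\right). \tag{95} $$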


Proof: Assume that the outcomes of $W$ are ordered in decreasing probabilities. Consider
the following variable-length separated achievability scheme: the source outcome $m$ is first
losslessly represented as a binary string of length $\lfloor \log_2 m \rfloor$ by assigning it to the $m$-th binary string
in $\{\emptyset, 0, 1, 00, 01, \ldots\}$ (the most likely outcome is represented by the empty string). Then, all
binary strings are grouped according to their encoded lengths. A channel codebook is generated
for each group of sequences. The encoded length is sent over the channel with high reliability, so
the decoder almost never makes an error in determining that length. Then the decoder makes an
ML decision only between sequences of that length. A formal description and an error analysis
follow.
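
The lossless representation used in this scheme, which maps the outcome $m$ to the $m$-th string of $\{\emptyset, 0, 1, 00, 01, \ldots\}$, amounts to dropping the leading 1 of the binary expansion of $m$. A minimal sketch, with hypothetical function names:

```python
def encode_index(m: int) -> str:
    """m-th binary string in {empty, 0, 1, 00, 01, ...}; its length is floor(log2 m)."""
    if m < 1:
        raise ValueError("outcomes are numbered from 1")
    return bin(m)[3:]  # bin(m) looks like '0b1...'; drop '0b' and the leading 1

def decode_index(bits: str) -> int:
    """Inverse map: restore the leading 1 and read the string as a binary number."""
    return int("1" + bits, 2)

def length_group(length: int, big_m: int) -> range:
    """Outcomes whose encoded length equals `length`, among outcomes 1..big_m."""
    return range(2 ** length, min(2 ** (length + 1) - 1, big_m) + 1)

print([encode_index(m) for m in range(1, 8)])
print(list(length_group(2, 6)))  # outcomes of encoded length 2 when M = 6
```

Grouping by length is what allows the decoder to first decide the length and then run an ML decision inside a single group.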

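Codebook: the collection of $M + \lfloor \log_2 M \rfloor + 1$ codewords

$$ c_j = \sqrt{E_1}\, e_j, \quad j = 1, 2, \ldots, M \tag{96} $$

$$ c_j = \sqrt{E_2}\, e_j, \quad j = M + 1, \ldots, M + \lfloor \log_2 M \rfloor + 1 \tag{97} $$

where $\{e_j, j = 1, 2, \ldots\}$ is an orthonormal basis of $L_2(\mathbb{R}^\infty)$.

Encoder: The encoder sends the pair $(m, \lfloor \log_2 m \rfloor)$ by transmitting $c_m + c_{M + \lfloor \log_2 m \rfloor + 1}$.

Decoder: Having received the infinite string $z$, corrupted by i.i.d. Gaussian noise, the decoder
first (reliably) decides between the $\lfloor \log_2 M \rfloor + 1$ possible values of $\lfloor \log_2 m \rfloor$ based on the minimum
distance:

$$ \hat{\ell} = \arg\min_{j} \| z - c_{M+j+1} \|, \quad j = 0, \ldots, \lfloor \log_2 M \rfloor \tag{98} $$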
As shown in [41, p. 258], [42], [30, Theorem 3], the probability of error of such a decision
is given by $\epsilon(E_2, \lfloor \log_2 M \rfloor + 1)$. This accounts for the second term in (95). The decoder then
decides between $2^{\hat{\ell}}$ messages$^{12}$ $j$ with $\lfloor \log_2 j \rfloor = \hat{\ell}$:

$$ \hat{c} = \arg\min_{j} \| z - c_j \|, \quad j = 2^{\hat{\ell}}, \ldots, \min\{2^{\hat{\ell}+1} - 1, M\} \tag{99} $$

The probability of error of this decision rule is similarly upper bounded by $\epsilon(E_1, m)$, provided
that the value of $\lfloor \log_2 m \rfloor$ was decoded correctly: $\hat{\ell} = \lfloor \log_2 m \rfloor$. Since $2^{\lfloor \log_2 m \rfloor} \le m$, this
accounts for the first term in (95).

■

Normally, one would choose $1 \ll E_2 \ll E_1$ so that the second term in (95), which corresponds
to the probability of decoding the length incorrectly, is negligible compared to the first term,
and the total energy $E \approx E_1$. Moreover, if $W$ takes values in a countably infinite alphabet,
one can truncate it so that the tail is negligible with respect to the first term in (95). To ease
the evaluation of the first term in (95), one might use $m \le \frac{1}{P_W(m)}$, which holds because the
outcomes are ordered in decreasing probabilities. In the equiprobable case, this
weakening leads to $\mathbb{E}\left[\epsilon(E_1, W)\right] \le \epsilon(E_1, M)$.
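
To get a feel for the two terms in (95), the error probability (94) can be evaluated by numerical quadrature. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes (94) to be the classical error probability of $m$ orthogonal signals of energy $E$ in Gaussian noise, namely $\epsilon(E, m) = 1 - \frac{1}{\sqrt{\pi N_0}} \int [1 - Q(\sqrt{2/N_0}\,x)]^{m-1} e^{-(x-\sqrt{E})^2/N_0}\,dx$, and it uses a plain trapezoidal rule. The step count, the integration span, the energies and the example distribution are all arbitrary choices.

```python
import math

def q_func(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def orthogonal_error(energy: float, m: int, n0: float = 1.0,
                     steps: int = 4000, span: float = 10.0) -> float:
    """eps(E, m) under the assumed form of (94), by the trapezoidal rule."""
    center = math.sqrt(energy)
    half_width = span * math.sqrt(n0 / 2.0)  # `span` standard deviations on each side
    lo = center - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        others_lose = (1.0 - q_func(math.sqrt(2.0 / n0) * x)) ** (m - 1)
        total += weight * others_lose * math.exp(-(x - center) ** 2 / n0)
    return 1.0 - total * h / math.sqrt(math.pi * n0)

def bound_95(probs, e1: float, e2: float, n0: float = 1.0) -> float:
    """Right side of (95); `probs` must be sorted in decreasing order."""
    big_m = len(probs)
    first = sum(p * orthogonal_error(e1, m, n0) for m, p in enumerate(probs, start=1))
    # floor(log2 M) + 1 equals the bit length of M
    second = orthogonal_error(e2, big_m.bit_length(), n0)
    return first + second

print(orthogonal_error(4.0, 2), q_func(2.0))  # for m = 2 both equal Q(sqrt(E / N_0))
print(bound_95([0.5, 0.25, 0.125, 0.125], e1=20.0, e2=12.0))
```

The check against $Q(\sqrt{E/N_0})$ for $m = 2$ is the textbook error probability of binary orthogonal signaling and serves only to validate the quadrature.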

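If the power constraint is average rather than maximal, a straightforward extension of Theorem
8 ensures the existence of an $(E, \epsilon)$ code (average power constraint) for the AWGN channel with

$$ \epsilon \le \mathbb{E}\left[\epsilon\left(E_1(\lfloor \log_2 W \rfloor), W\right)\right] + \epsilon\left(E_2, \lfloor \log_2 M \rfloor + 1\right), \tag{100} $$

where $E_1\colon \{0, 1, \ldots, \lfloor \log_2 M \rfloor\} \mapsto \mathbb{R}_+$ and $E_2 \in \mathbb{R}_+$ are such that

$$ \mathbb{E}\left[E_1(\lfloor \log_2 W \rfloor)\right] + E_2 \le E. \tag{101} $$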

V. Asymptotic expansions of the energy-distortion tradeoff 
A. Problem setup 

This section focuses on the energy-distortion tradeoff in the JSCC problem. As in Section
IV, we limit the total available transmitter energy $E$ without any restriction on the (average)
number of channel uses per source sample. Unlike Section IV, we allow general (not necessarily
discrete) sources, and we study the tradeoff between the source dimension $k$, the total energy $E$
and the fidelity of reproduction. Thus, we identify the minimum energy compatible with target
distortion without any restriction on the time-bandwidth product (number of degrees of freedom).
As in Section III-A, we consider both the average and the excess distortion criteria.

$^{12}$ More precisely, $2^{\hat{\ell}}$ messages if $\hat{\ell} \le \lfloor \log_2 M \rfloor - 1$ and $M - 2^{\lfloor \log_2 M \rfloor} + 1 \le 2^{\lfloor \log_2 M \rfloor}$ messages if $\hat{\ell} = \lfloor \log_2 M \rfloor$.

Formally, we let the source be a $k$-dimensional vector $S^k$. A $(k, E, d, \epsilon)$ energy-limited
code is an energy-limited code for $S^k$ with total energy $E$ and probability $\le \epsilon$ of distortion
exceeding $d$ (see (49)). Similarly, a $(k, E, d)$ energy-limited code is an energy-limited code for
$S^k$ with total energy $E$ and average distortion not exceeding $d$ (see (51)). The goal of Section V
is to characterize the minimum energy required to transmit $k$ source samples at a given fidelity,
i.e. to characterize the following fundamental limits:

$$ E_f^*(k, d) = \inf\{E \colon \exists \text{ a } (k, E, d) \text{ feedback code}\}, \tag{102} $$

$$ E_f^*(k, d, \epsilon) = \inf\{E \colon \exists \text{ a } (k, E, d, \epsilon) \text{ feedback code}\} \tag{103} $$

as well as the corresponding limits $E^*(k, d)$ and $E^*(k, d, \epsilon)$ of the energy-limited non-feedback
codes.

B. Previous results on the energy-per-bit and the energy-distortion tradeoff

If the source produces equiprobable binary strings of length $k$, Shannon [40] showed that
the minimum energy per information bit to noise power spectral density ratio compatible with
vanishing block error probability converges to

$$ \frac{E_b}{N_0} = \log_e 2 = -1.59 \text{ dB} \tag{104} $$

as $k \to \infty$, $\epsilon \to 0$. The fundamental limit in (104) holds regardless of whether feedback is
available. Moreover, this fundamental limit is known to be the same regardless of whether the
channel is subject to fading or whether the receiver is coherent or not [43]. Polyanskiy et al.
refined (104) as [30, Theorem 3]

$$ E^*(k, 0, \epsilon)\, \frac{\log e}{N_0} = k + \sqrt{2k \log e}\; Q^{-1}(\epsilon) - \frac{1}{2} \log k + O(1) \tag{105} $$

for transmission without feedback, and as [30, Theorem 8]

$$ E_f^*(k, 0, \epsilon)\, \frac{\log e}{N_0} = (1 - \epsilon) k + O(1) \tag{106} $$

for transmission with feedback. Moreover, [30, Theorem 7] (see also (88)) shows that in fact

$$ E_f^*(k, 0, 0)\, \frac{\log e}{N_0} = k, \tag{107} $$

i.e. in the presence of full noiseless feedback, Shannon's limit (104) can be achieved with equality
already at $k = 1$ and $\epsilon = 0$.

For the finite blocklength behavior of energy per bit in fading channels, see [44]. 
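
The figures quoted in this subsection are easy to reproduce. The sketch below converts Shannon's limit (104) to decibels and evaluates the leading terms of the expansions for equiprobable $k$-bit messages, taking them to read $k + \sqrt{2k \log e}\, Q^{-1}(\epsilon) - \frac{1}{2}\log k$ without feedback (105) and $(1-\epsilon)k$ with feedback (106). The $O(1)$ remainders are simply dropped, logarithms are taken to base 2, and the function names and the values of $k$ and $\epsilon$ are illustrative assumptions.

```python
import math
from statistics import NormalDist

LOG2_E = math.log2(math.e)

def q_inv(eps: float) -> float:
    """Inverse of the Gaussian tail function Q."""
    return NormalDist().inv_cdf(1.0 - eps)

def shannon_limit_db() -> float:
    """Minimum Eb/N0 = log_e 2 of (104), expressed in dB."""
    return 10.0 * math.log10(math.log(2.0))

def normalized_energy_no_feedback(k: int, eps: float) -> float:
    """(E / N0) log e in bits, from the assumed form of (105) with the O(1) term dropped."""
    return k + math.sqrt(2.0 * k * LOG2_E) * q_inv(eps) - 0.5 * math.log2(k)

def normalized_energy_feedback(k: int, eps: float) -> float:
    """(E / N0) log e in bits, from (106) with the O(1) term dropped."""
    return (1.0 - eps) * k

k, eps = 1000, 0.1
print(round(shannon_limit_db(), 2))
print(normalized_energy_no_feedback(k, eps), normalized_energy_feedback(k, eps))
```

For these values the non-feedback approximation exceeds $k$ while the feedback approximation falls below it, which is the $1 - \epsilon$ gain discussed in the text.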

For the transmission of a memoryless source over the AWGN channel under an average
distortion criterion, Jain et al. [29, Theorem 1] pointed out that as $k \to \infty$,

$$ \frac{E^*(k, d)}{k}\, \frac{\log e}{N_0} \to R(d). \tag{108} $$

Note that (108) still holds even if noiseless feedback is available.

Unlike Polyanskiy et al. [30], we allow analog sources and arbitrary distortion criteria, and 
unlike Jain et al. [29], we are interested in a nonasymptotic analysis of the minimum energy per 
sample. 


C. Energy-limited feedback codes 

Our first result in this section is a refinement of (108). 

Theorem 9. Let the source and its distortion measure satisfy assumptions A1-A5. The minimum
energy required to transmit $k$ source symbols with average distortion $\le d$ over an AWGN channel
with feedback satisfies

$$ E_f^*(k, d)\, \frac{\log e}{N_0} = k R(d) + O(\log k). \tag{109} $$

Proof: Achievability. The expansion in (109) is achieved by the following separated source/channel
scheme. For the source code, we use the code of Yang and Zhang [38] (abstract alphabet) that
compresses the source down to $M$ representation points with average distortion $d$ such that

$$ \log M = k R(d) + O(\log k). \tag{110} $$

For the channel code, we transmit the binary representation of the chosen representation point
(one of $M$) error-free using the optimal scheme of Polyanskiy et al. [30, Theorem 7], so that

$$ \log M = \frac{E}{N_0} \log e. \tag{111} $$

Converse. By data processing, similar to (90),

$$ k R(d) \le \frac{E}{N_0} \log e. \tag{112} $$

■

Remark 2. For the transmission of a Gaussian source over the feedback AWGN channel, we
have

$$ E_f^*(k, d)\, \frac{\log e}{N_0} = k R(d). \tag{113} $$

Indeed, the Schalkwijk-Bluestein scheme [22], [39] attains (113) for $k = 1$. For $k > 1$,
transmitting the Schalkwijk-Bluestein codewords corresponding to the $i$-th source sample in time
slots $i, k + i, 2k + i, \ldots$ attains (113) exactly for all $k = 1, 2, \ldots$


Theorem 10. In the transmission of a source satisfying the assumptions A1-A5 over an AWGN
channel with feedback, the minimum average energy required for the transmission of $k$ source
samples under the requirement that the probability of exceeding distortion $d$ is no greater than
$0 \le \epsilon < 1$ satisfies, as $k \to \infty$,

$$ E_f^*(k, d, \epsilon)\, \frac{\log e}{N_0} = (1 - \epsilon) k R(d) - \sqrt{\frac{k V(d)}{2\pi}}\; e^{-\frac{\left(Q^{-1}(\epsilon)\right)^2}{2}} + O(\log k). \tag{114} $$

Proof: Achievability. Pair a lossy compressor $S^k \to W$ with excess-distortion probability $\epsilon$
and minimum entropy $H(W)$ with the achievability scheme in Theorem 7, and apply (89) and (62).

Converse. Again, the converse result follows by proceeding as in (90) and invoking (62). ■

Comparing (114) and (72), we observe that, similar to the setup in Section III, allowing
feedback and an average power constraint reduces the asymptotically achievable minimum energy
per sample by a factor of $1 - \epsilon$. As in Section III, that limit is approached from below rather
than from above, i.e. finite blocklength helps.

Similar to the setup of Section III, naive separation achieves at most

$$ E_f^*(k, d, \epsilon)\, \frac{\log e}{N_0} \le (1 - \epsilon) k R(d) + a \sqrt{k \log k}, \quad a > 0. \tag{115} $$

D. Energy-limited non-feedback codes 

Our next result generalizes [30, Theorem 3]. Loosely speaking, it shows that the energy $E$,
probability of error $\epsilon$ and distortion $d$ of the best non-feedback code satisfy

$$ \frac{E}{N_0} \log e - k R(d) \approx \sqrt{k V(d) + \frac{2E}{N_0} \log^2 e}\; Q^{-1}(\epsilon). \tag{116} $$

Note that in (116) source and channel dispersions add up, as in the usual (non-feedback) joint
source-channel coding problem [6], [8]. More precisely, we have the following:

Theorem 11. In the transmission of a stationary memoryless source (satisfying the assumptions
A1-A5) over the AWGN channel, the minimum energy necessary for achieving probability
$0 < \epsilon < 1$ of exceeding distortion $d$ satisfies, as $k \to \infty$,

$$ E^*(k, d, \epsilon)\, \frac{\log e}{N_0} = k R(d) + \sqrt{k \left(2 R(d) \log e + V(d)\right)}\; Q^{-1}(\epsilon) + O(\log k). \tag{117} $$


Proof:

Achievability: We let the total energy $E$ be such that

$$ \frac{E}{N_0} \log e = k R(d) + \sqrt{k \left(2 R(d) \log e + V(d)\right)}\; Q^{-1}\left(\epsilon - \frac{a}{\sqrt{k}}\right) + b \log k, \tag{118} $$

and we show that $a > 0$ and $b$ can be chosen so that the excess distortion probability is bounded
by $\epsilon$.

We consider a good lossy code with $M = \exp(2kR(d))$ representation points, so that the
probability that the source is not represented within distortion $d$ is exponentially small. We show
that a combination of that code with the variable-length separated scheme in Theorem 8 achieves
(118). First, we prove the following generalization of Theorem 8 to the lossy case: for any $M$,
there exists a $(k, E, d, \epsilon')$ code for the AWGN channel (without feedback) such that

$$ \epsilon' \le \mathbb{E}\left[\epsilon\left(E_1, \frac{1}{P_{Z^{k\star}}(B_d(S^k))}\right)\right] + \epsilon\left(E_2, \lfloor \log_2 M \rfloor + 1\right) + \mathbb{E}\left[\left(1 - P_{Z^{k\star}}(B_d(S^k))\right)^M\right], \tag{119} $$

where

$$ B_d(s^k) \triangleq \left\{ z^k \in \hat{\mathcal{S}}^k \colon d(s^k, z^k) \le d \right\}. \tag{120} $$

Towards that end, let the representation points $Z_1, Z_2, \ldots, Z_M$ be drawn i.i.d. from $P_{Z^{k\star}}$. The
source encoder goes down the list of the representation points and outputs the index of the first
$d$-close match to $S^k$:

$$ W = \min\left\{ m \colon d(S^k, Z_m) \le d \right\} \tag{121} $$

(if there is no such index, it outputs 1). Averaged over $Z_1, \ldots, Z_M$, the probability that no
$d$-close match is found is upper bounded by the third term in (119) (e.g. [24, Theorem 10]). The
index $W$ is then transmitted over the channel using the scheme in Theorem 8, with the total
probability of error averaged over all lossy codebooks given by

$$ \epsilon' \le \mathbb{E}\left[\epsilon(E_1, W)\right] + \epsilon\left(E_2, \lfloor \log_2 M \rfloor + 1\right) + \mathbb{E}\left[\left(1 - P_{Z^{k\star}}(B_d(S^k))\right)^M\right]. \tag{122} $$
Since conditioned on $S^k = s^k$, $W$ is geometrically distributed with success probability
$P_{Z^{k\star}}(B_d(s^k))$, we have

$$ \mathbb{E}\left[W \mid S^k = s^k\right] = \frac{1}{P_{Z^{k\star}}(B_d(s^k))}. \tag{123} $$

Noting that $\epsilon(E, m)$ is a concave function of $m$, we have by Jensen's inequality

$$ \mathbb{E}\left[\epsilon(E, W)\right] \le \mathbb{E}\left[\epsilon\left(E, \mathbb{E}\left[W \mid S^k\right]\right)\right], \tag{124} $$

which gives the first term in (95).
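
The first-match encoder (121) and the conditional mean (123) can be illustrated with a small simulation. This is a toy sketch under assumptions of our own: a binary source with normalized Hamming distortion, and representation points drawn uniformly at random rather than from the distribution $P_{Z^{k\star}}$ used in the proof. All names and parameter values are hypothetical.

```python
import random

def hamming_fraction(s, z) -> float:
    """Normalized Hamming distortion between two equal-length binary vectors."""
    return sum(a != b for a, b in zip(s, z)) / len(s)

def first_match_index(source, codebook, d, distortion=hamming_fraction) -> int:
    """Index (1-based) of the first d-close representation point, as in (121);
    outputs 1 if no representation point is d-close."""
    for m, z in enumerate(codebook, start=1):
        if distortion(source, z) <= d:
            return m
    return 1

def mean_index(k: int, d: float, num_points: int, trials: int, seed: int = 0) -> float:
    """Empirical mean of W, averaged over sources and random codebooks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        source = [rng.randint(0, 1) for _ in range(k)]
        codebook = [[rng.randint(0, 1) for _ in range(k)] for _ in range(num_points)]
        total += first_match_index(source, codebook, d)
    return total / trials

# With k = 4 and d = 1/4, a uniform point is d-close with probability (1 + 4) / 16,
# so the geometric mean predicted by (123) is 16 / 5 = 3.2.
print(mean_index(k=4, d=0.25, num_points=64, trials=2000))
```

The truncation at `num_points` representation points has a negligible effect here because a codebook of 64 points contains a $d$-close match with overwhelming probability.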

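We proceed to show that with the choice of

$$ E_1 = E - c \log E, \tag{125} $$

for an appropriate $c > 0$, and $M = \exp(2kR(d))$, the right side of (119) is upper bounded by $\epsilon$.
A reasoning similar to [24, (108)-(111)] and the Cramer-Chernoff bound yield

$$ \mathbb{E}\left[\left(1 - P_{Z^{k\star}}(B_d(S^k))\right)^M\right] \le \exp(-k a_1) \tag{126} $$

for some $a_1 > 0$. On the other hand, (105) [30, Theorem 3] implies

$$ \epsilon(E, m) = \mathbb{P}\left[\log m > G(E)\right] + O\left(\frac{1}{\sqrt{k}}\right), \tag{127} $$

where $G(E) \sim \mathcal{N}\left(\frac{E}{N_0} \log e - \frac{1}{2} \log \frac{E}{N_0},\; \frac{2E}{N_0} \log^2 e\right)$.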

Applying (126) and (127) to (95), we conclude that the excess-distortion probability is bounded
above by

$$ \mathbb{P}\left[\log \frac{1}{P_{Z^{k\star}}(B_d(S^k))} > G(E - c \log E)\right] + \mathbb{P}\left[\log(\log M + 1) > G(c \log E)\right] + O\left(\frac{1}{\sqrt{E}}\right). \tag{128} $$

The second term in (128) can be made to decay as fast as $O\left(\frac{1}{\sqrt{k}}\right)$ for large enough $c$. To
evaluate the first term in (128), we recall [24, Lemma 2], which states that for $k$ large enough,

$$ \mathbb{P}\left[\log \frac{1}{P_{Z^{k\star}}(B_d(S^k))} \le \sum_{i=1}^{k} j_S(S_i, d) + C \log k + c_0\right] \ge 1 - \frac{K}{\sqrt{k}}, \tag{129} $$

where $c_0$ and $K$ are constants, and $C$ is a constant explicitly identified in [24, Lemma 2]. Letting
$b = c + C - \frac{1}{2}$ and applying (129) to upper-bound the first term in (128), we conclude by the
Berry-Esseen Theorem that

$$ \mathbb{P}\left[\log \frac{1}{P_{Z^{k\star}}(B_d(S^k))} > G(E - c \log E)\right] \le \epsilon - \frac{a}{\sqrt{k}} + O\left(\frac{1}{\sqrt{k}}\right). \tag{130} $$

Since $a$ can be chosen so that (128) is upper bounded by $\epsilon$ for $k$ large enough, this concludes
the proof of the achievability part.

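Converse: The result in [6, Theorem 1] states that:

$$ \epsilon \ge \sup_{\gamma > 0} \left\{ \sup_{P_{\bar{Y}}} \mathbb{E}\left[ \inf_{x \colon \|x\|^2 \le E} \mathbb{P}\left[ j_{S^k}(S^k, d) - \imath_{X;\bar{Y}}(x; Y) \ge \gamma \mid S^k \right] \right] - \exp(-\gamma) \right\} \tag{131} $$

where $j_{S^k}(s^k, d)$ is the $d$-tilted information in $s^k$ defined in (58), $\imath_{X;\bar{Y}}(x; y) = \log \frac{dP_{Y|X=x}}{dP_{\bar{Y}}}(y)$,
and $P_{Y|X=x}$ and $P_{\bar{Y}}$ are specialized to

$$ P_{\bar{Y}} = \prod_{i=1}^{\infty} \mathcal{N}\left(0, \frac{N_0}{2}\right), \tag{132} $$

$$ P_{Y|X=x} = \prod_{i=1}^{\infty} \mathcal{N}\left(x_i, \frac{N_0}{2}\right), \tag{133} $$

so that

$$ \imath_{X;\bar{Y}}(x; Y) \sim \mathcal{N}\left(\frac{\|x\|^2}{N_0} \log e,\; \frac{2\|x\|^2}{N_0} \log^2 e\right). \tag{134} $$

Next, we let $\gamma = \frac{1}{2} \log k$ in (131). Since $j_{S^k}(S^k, d) = \sum_{i=1}^{k} j_S(S_i, d)$ is a sum of independent
random variables, the Berry-Esseen bound applies to the probability in (131), and the converse
direction of (117) follows since $\|x\|^2 \le E$. ■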

If the maximal power constraint in (85) is relaxed to (87), then $E^*(k, d, \epsilon)$, the minimum
average energy required for transmitting $k$ source samples over an AWGN channel with the
probability of exceeding distortion $d$ smaller than or equal to $0 \le \epsilon < 1$, satisfies, under
assumptions A1-A5:

$$ E^*(k, d, \epsilon)\, \frac{\log e}{N_0} = R(d)(1 - \epsilon) k + O\left(\sqrt{k \log k}\right), \tag{135} $$

i.e. the asymptotically achievable minimum energy per sample is reduced by a factor of $1 - \epsilon$ if a
maximal power constraint is relaxed to an average one. This parallels the result in (78), which
shows that variable-length coding over a channel reduces the asymptotic fundamental limit by
a factor of $1 - \epsilon$ compared to fixed-length coding, even without feedback.

Proof of (135). Observe that Theorem 10 ensures that a smaller average energy than that
in (135) is not attainable even with full noiseless feedback. In the achievability direction, let
$(f^*, g^*)$ be the optimal variable-length source code achieving the probability of exceeding $d$ equal
to $\epsilon'$ (see [5, Section III.B]). Denote by $\ell(f^*(s))$ the length of $f^*(s)$. Let $M$ be the size of that
code. Set the energy to transmit the codeword of length $\ell(f^*(S^k))$ to

$$ \ell(f^*(S^k))\, N_0 \log_e 2 + \sqrt{k \log k}. \tag{136} $$

As shown in [5], $\mathbb{E}\left[\ell(f^*(S^k))\right]$ is equal to the right side of (72) (with $\epsilon$ replaced by $\epsilon'$). Choosing
$\epsilon' = \epsilon - \frac{a}{\sqrt{k}}$ for some $a$, we conclude that indeed the average energy satisfies (135). Moreover,
(127) implies that the expression inside the expectation in (100) is $O\left(\frac{1}{\sqrt{k}}\right)$. It follows that for
a large enough $a$, the excess distortion probability is bounded by $\epsilon$. ■
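
The expansions of this section can be compared numerically. The sketch below is a rough illustration under several assumptions that are not part of the analysis above: it takes (117) to read $kR(d) + \sqrt{k(2R(d)\log e + V(d))}\,Q^{-1}(\epsilon)$ and (114) to read $(1-\epsilon)kR(d) - \sqrt{kV(d)/(2\pi)}\,e^{-(Q^{-1}(\epsilon))^2/2}$, drops all remainder terms, uses logarithms to base 2, and specializes to a Gaussian source under mean-square error with $R(d) = \frac{1}{2}\log_2\frac{\sigma^2}{d}$ and rate-dispersion $V(d) = \frac{1}{2}\log_2^2 e$. Function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

LOG2_E = math.log2(math.e)
GAUSSIAN_DISPERSION = 0.5 * LOG2_E ** 2  # assumed V(d) of a Gaussian source, in bits^2

def q_inv(eps: float) -> float:
    """Inverse of the Gaussian tail function Q."""
    return NormalDist().inv_cdf(1.0 - eps)

def gaussian_rate(sigma2: float, d: float) -> float:
    """Assumed R(d) in bits for a Gaussian source under mean-square error."""
    return 0.5 * math.log2(sigma2 / d)

def energy_no_feedback(k, eps, rate, disp):
    """Assumed form of (117), maximal energy constraint, remainder dropped."""
    return k * rate + math.sqrt(k * (2.0 * rate * LOG2_E + disp)) * q_inv(eps)

def energy_feedback(k, eps, rate, disp):
    """Assumed form of (114), feedback and average energy, remainder dropped."""
    correction = math.sqrt(k * disp / (2.0 * math.pi)) * math.exp(-q_inv(eps) ** 2 / 2.0)
    return (1.0 - eps) * k * rate - correction

def energy_average_constraint(k, eps, rate):
    """Leading term of (135), no feedback and average energy."""
    return (1.0 - eps) * k * rate

k, eps = 1000, 0.1
rate = gaussian_rate(1.0, 0.25)  # 1 bit per sample
print(energy_no_feedback(k, eps, rate, GAUSSIAN_DISPERSION))
print(energy_average_constraint(k, eps, rate))
print(energy_feedback(k, eps, rate, GAUSSIAN_DISPERSION))
```

All three quantities are values of $\frac{E}{N_0}\log e$ in bits; dividing by $k$ gives the normalized energy per sample.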

VI. Conclusion 

We have considered several scenarios for joint source-channel coding with and without
feedback. Our main conclusions are:

1) The average delay vs. distortion tradeoff with feedback, as well as the average energy
vs. distortion tradeoff with feedback, is governed by the channel capacity, and the source
rate-distortion and rate-dispersion functions. In particular, the channel dispersion plays no
role.

2) In variable-length coding with feedback, the asymptotically achievable minimum average
length is reduced by a factor of $1 - \epsilon$, where $\epsilon$ is the excess distortion probability. This
asymptotic fundamental limit is approached from below, i.e., counter-intuitively, smaller
source blocklengths may lead to smaller attainable average delays.

3) Introducing a termination symbol that is always decoded error-free allows for transmission
over noisy channels with guaranteed distortion.

4) Variable-length transmission without feedback still improves the asymptotic fundamental
limit by a factor of $1 - \epsilon$, where $\epsilon$ is the excess distortion probability.

5) In all the cases we have analyzed, the approach to the fundamental limits is very fast:
$O\left(\frac{\log k}{k}\right)$, where $k$ is the source blocklength. This behavior is attained, under average
distortion, by a separated scheme with stop-feedback.

6) The setting of a wideband Gaussian channel with an energy constraint exhibits many
interesting parallels with the variable-length coding setting. Allowing a non-vanishing excess
distortion probability $\epsilon$ and an average (rather than maximal) energy constraint reduces the
asymptotically achievable minimum energy by a factor of $1 - \epsilon$. In the presence of feedback,
just as in variable-length coding, this asymptotic fundamental limit is approached from
below.

7) Error-free transmission of a random variable $W$ over the AWGN channel with ideal
feedback, with almost optimal energy consumption, is possible via a variable-length separated
scheme.

8) More generally, variable-length separated schemes perform remarkably well in all
considered scenarios.